Optimal SNR Model Selection in Multiple-Model Based Speech Recognition System
Abstract
In a multiple-model based speech recognition system, multiple HMMs corresponding to different noise types and SNR values are trained, and the model closest to the input speech is selected for recognition. Previous research on multiple-model based speech recognition has assumed that the best performance is obtained by selecting the HMM whose SNR value is most similar to that of the input speech. Our experimental results, however, show that better performance can be obtained when there is some mismatch between the SNR of the input speech and that of the selected HMM. In this paper, we experimentally determine the optimal HMM for each SNR value of the input speech in the multiple-model based speech recognizer. In recognition experiments on the Aurora 2 database, using the experimentally determined optimal HMMs gave far better recognition results than the conventional multiple-model based speech recognizer.
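The difference between the conventional selection rule and the table-driven rule described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the trained SNR grid, the lookup table values, and the function names are hypothetical placeholders, not the mapping determined in the paper, and the direction and size of the mismatch shown here are arbitrary.

```python
# Hypothetical SNR values (dB) at which HMM sets were trained (assumption).
TRAINED_SNRS_DB = [0, 5, 10, 15, 20]

# Hypothetical lookup table: quantized input SNR -> SNR of the HMM set to use.
# These are placeholder values, NOT the experimentally determined mapping.
OPTIMAL_MODEL_SNR_DB = {0: 5, 5: 10, 10: 15, 15: 20, 20: 20}


def nearest_trained_snr(input_snr_db):
    """Conventional rule: pick the trained SNR closest to the estimated SNR."""
    return min(TRAINED_SNRS_DB, key=lambda s: abs(s - input_snr_db))


def select_model_snr(input_snr_db, table=OPTIMAL_MODEL_SNR_DB):
    """Table-driven rule: quantize the estimate, then look up the model SNR."""
    return table[nearest_trained_snr(input_snr_db)]


if __name__ == "__main__":
    estimated_snr_db = 7.0  # would come from an SNR estimator in practice
    print("conventional:", nearest_trained_snr(estimated_snr_db))
    print("table-driven:", select_model_snr(estimated_snr_db))
```

In such a setup the table would be filled in by running recognition on held-out data at each input SNR with every trained HMM set and recording which set gives the highest accuracy.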
Similar resources
Compensation of SNR and noise type mismatch using an environmental sniffing based speech recognition solution
Multiple-model based speech recognition (MMSR) has been shown to be quite successful in noisy speech recognition. Since it employs multiple hidden Markov model (HMM) sets that correspond to various noise types and signal-to-noise ratio (SNR) values, the selected acoustic model can be closely matched with the test noisy speech, which leads to improved performance when compared with other state-o...
Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment
This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on the Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context-dependent phones (triphones) and mel-frequency cepstral coefficient (MFCC) analysis of speech. The main aim is to analyze and to search for the optimal set...
Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms
One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...
Lip-reading from parametric lip contours for audio-visual speech recognition
This paper describes the incorporation of a visual lip tracking and lip-reading algorithm that utilizes the affine-invariant Fourier descriptors from parametric lip contours to improve the audio-visual speech recognition systems. The audio-visual speech recognition system presented here uses parallel hidden Markov models (HMMs), where a joint decision, using an optimal decision rule, is made af...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Publication date: 2012